A Situated View of Representation and Control

Authors

  • Stanley J. Rosenschein
  • Leslie Pack Kaelbling

... specifications.

3.1.1 Maintaining Invariants Through Goal Regression

Indirect methods define the action-selection mapping by deriving it from some description of the environment and the goal, whether in the form of an explicit combinatorial object like a graph, or in the form of declarative assertions, such as the operator descriptions found in classical AI planning systems. To illustrate how a stimulus-response agent can be constructed algorithmically from an explicit description of an environment and goal, we consider the special case of agents that maintain invariants. Although the method illustrated does not scale well with large state sets, it does introduce important concepts and build up intuitions about properties of action strategies.

A stimulus-response agent that maintains invariants can be synthesized as follows. Let the environment be represented as a nondeterministic automaton $\langle S, P, A, \mathit{init}, \rightarrow, \mathit{out} \rangle$, where

  • $S$ is a finite set of states of the environment;
  • $P$ is a finite set of outputs (these are usefully viewed as percepts from the agent's perspective);
  • $A$ is a finite set of actions that the agent can generate as input to the environment;
  • $\mathit{init}$ is a set of states containing the one that the environment is known to be in initially;
  • $\rightarrow$ is a relation on $S \times A \times S$, where $(s_1, a, s_2) \in {\rightarrow}$ holds if it is possible for the world to make a transition from state $s_1$ to state $s_2$ when action $a$ is generated by the agent; and
  • $\mathit{out}$ is a function mapping $S$ to $P$.

For the simple pure-action case, we assume that the environment automaton outputs its full state; in other words, the percept set $P$ is identical to the state set $S$, and $\mathit{out}$ is the identity function on states. Let the goal be represented by $G$, a subset of $S$ that the agent is to maintain as an invariant condition. A solution to this problem is $G^*$, a subset of $G$ within which the environment can be made to stay indefinitely, and a mapping from $G^*$ to $A$, specifying the actions the agent should take in order to stay within $G^*$. The set $G^*$ can be computed iteratively, as follows:

    Let $G_0 := G$
    For $n = 0, 1, 2, \ldots$
        Let $G_{n+1} := \emptyset$
        For all $g \in G_n$
            If $\exists a.\ \forall g'.\ (g, a, g') \in {\rightarrow} \Rightarrow g' \in G_n$, then add $g$ to $G_{n+1}$
        When $G_n = G_{n+1}$, terminate and return $G_n$

Each intermediate set $G_n$ is the set of states from which $G$ can be maintained for at least $n$ steps. For any state $g \in G$, if there exists an action such that from every possible successor state $g'$, $G$ can be maintained for $n$ steps, then in state $g$, $G$ can be maintained for $n+1$ steps. This step is called "goal regression," because $G_{n+1}$ is the weakest precondition under which $G_n$ can be made true on the next step (see Rosenschein [13] or Waldinger [17] for a more complete description of regression-based planning). When this process reaches a fixed point, we have determined the set $G^*$ of states from which $G$ can be maintained indefinitely. In order to maintain $G$ from some state $g$, the agent can do any action $a$ such that $\forall g'.\ (g, a, g') \in {\rightarrow} \Rightarrow g' \in G^*$.
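This fixed-point construction is compact enough to state directly as code. The following Python sketch is our own illustration, not a system described in this paper; the function name, the `transitions` table mapping each state-action pair to its set of possible successors, and the returned policy dictionary are all invented for the example. It computes $G^*$ together with one safe action per state.

    # A minimal sketch of goal regression for maintenance goals (illustrative).
    # `transitions` maps (state, action) to the SET of possible successor
    # states and must be defined for every such pair; `goal` is the set G.

    def maintainable_set(actions, transitions, goal):
        """Return (G*, policy): the subset of `goal` that can be maintained
        forever, plus one safe action for each state in it."""
        g_n = set(goal)
        while True:
            # Goal regression: keep g if some action confines all successors to G_n.
            g_next = {g for g in g_n
                      if any(transitions[(g, a)] <= g_n for a in actions)}
            if g_next == g_n:  # fixed point reached: this is G*
                policy = {g: next(a for a in actions
                                  if transitions[(g, a)] <= g_n)
                          for g in g_n}
                return g_n, policy
            g_n = g_next

For the plant-watering domain of the next subsection, the states would be moisture vectors, the actions would be watering each plant or doing nothing, and `goal` would be the set of vectors with no zero entry.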
3.1.2 Goal Regression Example

Consider a simple domain in which a robot must keep plants alive by watering them. The action set of the robot contains actions to water each plant, plus no-op, the action that does nothing. The state of the world can be expressed as a vector describing the moisture level of each plant, where 4 is wet (just watered) and 0 is dead. For example, the vector (4 3 0) describes a situation in which the first plant is wet, the second slightly drier, and the third dead. Moisture decreases by one on every time step on which the plant is not watered. Plants that die (reach moisture level 0) stay dead forever.

We consider a situation in which there are three plants and the goal is to maintain the condition "no plants are dead"; $G$ is enumerated below (with equivalent states under different orderings of the plants deleted, because the identity of the individuals is irrelevant to the maintenance of this goal):

    (4 4 4) (4 4 3) (4 4 2) (4 4 1) (4 3 3) (4 3 2) (4 3 1)
    (4 2 2) (4 2 1) (4 1 1) (3 3 3) (3 3 2) (3 3 1) (3 2 2)
    (3 2 1) (3 1 1) (2 2 2) (2 2 1) (2 1 1) (1 1 1)

The $G^*$ resulting from the goal-regression algorithm is:

    (4 4 4) (4 4 3) (4 4 2) (4 4 1) (4 3 3) (4 3 2) (4 3 1)
    (4 2 2) (4 2 1) (3 3 3) (3 3 2) (3 3 1) (3 2 2) (3 2 1)

The most constrained state in $G^*$ is (3 2 1). By watering the plant at moisture level 1, the state is changed to (4 2 1), then to (4 3 1), and then to (4 3 2). At this point the no-op action is allowed, and the robot can rest, leading back to the original starting state of (3 2 1). Note that there are no states in $G^*$ with two plants at level 1 or three at level 2; although these states are in $G$ (no plants are currently dead), they are not in $G^*$, because it is not possible to keep all the plants from dying in the future.

3.1.3 Discussion

As mentioned above, this construction does not scale well as the number of environment states increases, and this motivates the use of other representations. Although ordinarily used to handle run-time goals of achievement, the declarative operator descriptions used in AI planning systems encode the same information as state-transition graphs, and can be used to drive the construction above. Operator descriptions provide a more intuitively interpretable form of expression and can often be manipulated more efficiently, because they refer to large subspaces of the state space with terse symbolic labels. Rather than calculating $G^*$ through enumeration, operator descriptions allow it to be calculated through symbolic regression. This may be more or less efficient than the alternative, depending on specifics of the problem domain. This technique has been implemented and explored as an extension to the Gapps programming system [6].

3.2 Pure Perception

Until now we have assumed that inputs from the environment are sufficiently informative, in that they encode all the world-state information needed to drive action. In cases where less information is available, the inputs to action selection must be derived by accumulating partial information over time, and for this purpose additional machinery is necessary. We refer to this additional machinery as the "perception system" and explore its properties in this section.

As in the case of action selection, it will be useful to approach perception by beginning with a study of the pure phenomenon. By pure perception we mean agent-environment systems in which the outputs of the agent have no influence on the environment at all, and the agent is simply a tracking system, or monitor: a passive observer, seeing, but not seen by, the environment. This special type of agent, again, will be of limited practical use, but it does illustrate the essential features of information extraction. The setup for pure perception is illustrated in figure 4. The lack of influence of the agent on the environment cannot be depicted graphically; the environment's next-state function is simply independent of the output of the agent.

[Figure 4: The pure perception case]
The focus in analyzing the perception system is on the kind of correspondence maintained between its internal states and states of the environment. This correspondence is, in fact, a form of invariant of exactly the type investigated in the previous section, but over the states of the agent-environment pair rather than just the environment. Even when the environment is indifferent to the actions of the agent, it makes sense to ask how the perception component might be designed to maximize the degree of correlation between its states and those of the environment, hence maximizing its information.

To see this most clearly, consider an environment, modeled once again as a nondeterministic automaton $\langle S, P, A, \mathit{init}, \rightarrow, \mathit{out} \rangle$. What is the maximum amount of information encoded in an instantaneous percept? In general, the best we can do is to associate with each percept $p$ the set of environment states with which it is compatible (i.e., those $s$ such that $\mathit{out}(s) = p$). What is the maximum amount of information about the environment that could be accumulated by the agent automaton over time? Given a rich enough inventory of internal states, a pure perception agent could optimally track the environment by having states isomorphic to the powerset of the environment's states. Let $\Sigma = \mathrm{powerset}(S)$ be the set of internal states of the agent, with the agent in state $\sigma$ if and only if every world state $s \in \sigma$ is consistent with the agent's perceptual history so far. The agent's initial state is the set of possible initial states of the environment, $\mathit{init}$, and its transition function $N(\sigma, a, p)$, which maps the previous internal state $\sigma$, the last action $a$, and the last percept $p$ into a new internal state, is given by

    $N(\sigma, a, p) = \{\, s' \mid \exists s \in \sigma.\ (s, a, s') \in {\rightarrow} \wedge \mathit{out}(s) = p \,\}$.

This powerset automaton might be cumbersome indeed, but its tracking behavior would be optimal. Although the powerset construction quickly becomes infeasible as the number of environment states rises, it is useful as a thought experiment, because much of its value can be preserved through efficient but information-rich approximations. Mathematically, these approximations are homomorphic images of the ideal powerset automaton, and thus are consistent with, but not as complete as, that ideal, or optimal, tracker. Nevertheless, these homomorphic images allow useful information to be monitored, while carefully trading off computational space and time under the designer's control.

One simple approach to constructing homomorphic projections of the powerset automaton is to choose a set of interesting or significant states in the powerset automaton, and close these under union and intersection. The result is a lattice, which will be a sub-lattice of the powerset Boolean algebra. The construction of the initial state and transition function of the perception system then proceeds as in the case of the powerset automaton above, but with the true powerset elements approximated by least upper bounds in the sub-lattice. For example, if in the original powerset automaton the transition function maps a state to a successor state that is not an element of the homomorphic-image lattice, the element of that lattice which best approximates the successor state will be returned instead. Thus the lattice transition function approximates the optimal transition function and degrades gracefully with the precision of the representation. The lattices themselves would typically be cartesian products of simpler lattices, with elements that could be represented compactly as parameter vectors.
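As a concrete rendering of the optimal update $N$, the following sketch (again our own illustrative code, with a `transitions` table and an `out` function assumed as inputs in the same style as before) maintains the set of environment states consistent with the action-percept history:

    # A minimal sketch of the optimal powerset tracker (illustrative).
    # `transitions` maps (state, action) to the set of possible successors;
    # `out` maps a state to the percept it emits.

    def tracker_init(init_states):
        """Initial information state: every state the environment may start in."""
        return frozenset(init_states)

    def tracker_update(sigma, action, percept, transitions, out):
        """N(sigma, a, p): successors of those states in sigma that are
        compatible with the last percept p."""
        return frozenset(
            s2
            for s in sigma
            if out(s) == percept              # s agrees with the observed percept
            for s2 in transitions[(s, action)]
        )

A lattice-based approximation of the kind described above would replace the exact result set by the least element of the chosen sub-lattice that contains it, keeping the representation bounded at some cost in precision.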
This lattice technique forms the basis of the ruler system [15]. ruler takes an approach analogous in many ways to AI planning systems. In ruler, the environment is described by a set of assertions, including temporal assertions that describe conditions that are either true initially or that will be true in the next state, depending on current conditions. The ruler compiler synthesizes perceptual machinery (an initial state and a next-state function) by chaining together these individual assertions, not with a view toward constructing action sequences, but rather with a view toward computing descriptive parameters in the next state's world model. The use of lattices as the semantic domain of interpretation of the model parameters, along with effectively closing the parameter space under intersection, allows incremental information to be folded in nicely and leads to a compositional methodology for constructing perceptual update mechanisms.

3.2.1 Ruler Compilation

ruler's compilation method works as follows. The compiler takes as input a description of the information carried by the run-time inputs to the program and the internal state variables, as well as a background theory containing temporal facts about the world. The compiler operates by deriving theorems about what is true initially and about what will be true at any time, given what was true at the previous time. In the course of the derivation, free variables are instantiated in the manner of logic programming systems. From the instantiated formulas, the compiler extracts a program for initializing and updating a state vector with the desired informational properties. More precisely, the compiler's inputs consist of the following:

  • a list $[a_1, \ldots, a_n]$ of input locations;
  • a list $[b_1, \ldots, b_m]$ of internal state locations;
  • for each input location $a$, a formula $P_a(U)$ with free variable $U$;
  • for each internal location $b$, a formula $P_b(U)$ with free variable $U$ and a function $\mathit{rconj}_b$;
  • a finite set of facts.

The formulas $P_x(U)$ express propositions parameterized by $U$, where $U$ ranges over run-time values of location $x$; for example, $P_{b_4}(6)$ might denote "current soil moisture level is at least 6." These values are drawn from a lattice, so that degrees of partial information can be represented. The $\mathit{rconj}$ operations are binary functions that take a pair of lattice values and combine them into a single lattice value summarizing their conjunctive content as precisely as possible. (The $\overline{\mathit{rconj}}$ operation extends $\mathit{rconj}$ to sets of lattice values in the natural way.) Using formulas in this way, the propositions that were merely implicit in the information of the machine can be made explicit and manipulated by the compiler.

For each internal location $b$, the compiler computes two sets of run-time value terms, $I_b$ and $N_b$, defined as follows (the $\Box$ symbol is the temporal logic operator representing "necessarily always"):

    $I_b = \{\, e \mid\ \vdash \Box\, \mathit{init}\ P_b(e) \,\}$
    $N_b = \{\, e' \mid\ \vdash \Box(P_{a_1}(a_1) \wedge \cdots \wedge P_{b_m}(b_m) \rightarrow \mathit{next}\ P_b(e')) \,\}$,

where $e' = f([a_1, \ldots, a_n], [b_1, \ldots, b_m])$, an expression over the input and state locations. If we are initially ignorant of soil moisture, we might have only $\Box P_{b_4}(0)$, so $I_{b_4} = \{0\}$. If our lower bound on moisture decreases by 1 per time step, then we might have $\Box(P_{b_4}(n) \rightarrow \mathit{next}\ P_{b_4}(n-1))$. Each set $I_b$ contains terms representing properties that can be proved from the background theory to hold initially in the world. Each set $N_b$ contains terms for properties that can be proved to hold "next," given the properties that hold now, as represented by the values of the input and state locations. If these sets are infinite, they can be generated and used incrementally; this is discussed more fully below.
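Before turning to how the compiler assembles these sets into a program, it may help to see one concrete lattice of the kind ruler can use as its semantic domain. The sketch below is our own illustration, not ruler's actual data structures: interval values ordered by containment, with rconj realized as interval intersection.

    # An illustrative interval lattice: the value (lo, hi) means "the true
    # quantity lies between lo and hi". Wider intervals carry less
    # information; TOP is total ignorance within the representable range.

    MAX = 4                 # illustrative maximum moisture level
    TOP = (0, MAX)

    def rconj(v1, v2):
        """Conjoin two interval values as precisely as possible: intersect.
        An empty intersection means the inputs were contradictory
        (collapsed here to None for simplicity)."""
        lo, hi = max(v1[0], v2[0]), min(v1[1], v2[1])
        return (lo, hi) if lo <= hi else None

    assert rconj(TOP, (2, 3)) == (2, 3)   # conjoining with ignorance loses nothing
    assert rconj((0, 2), (1, 4)) == (1, 2)

In this lattice, $P_b((lo, hi))$ would read "the tracked quantity lies in $[lo, hi]$," and $\overline{\mathit{rconj}}$ simply folds rconj over a set of such intervals.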
From these collections of sets the compiler computes the initial value of the state vector, $v_0$, and its update function, $f$. The initial value is computed as follows:

    $v_0 = [\overline{\mathit{rconj}}_{b_1}(I_{b_1}), \ldots, \overline{\mathit{rconj}}_{b_m}(I_{b_m})]$.

In other words, the initial value of the state vector is the vector of values derived by rconj-ing values representing the strongest propositions that can be inferred by the compiler about the initial state of the environment, in the "language" of each of the state components. Similarly, for the next-state function:

    $f([a_1, \ldots, a_n], [b_1, \ldots, b_m]) = [\overline{\mathit{rconj}}_{b_1}(N_{b_1}), \ldots, \overline{\mathit{rconj}}_{b_m}(N_{b_m})]$.

Here the compiler constructs a vector of expressions that denote the strongest propositions about what will be true next, again in the language of the state components. In the case of the initial value, the rconj values can be computed at compile time, because the values of all the arguments are available. In the case of the next-state function, however, the rconj terms will not denote values known at compile time. Rather, they will generally be nested expressions containing operators that will be used to compute values at run time. Assuming the execution time of these operators is bounded, the depth of the expressions provides a bound on the update time of the state vector.

Without restricting the background theory, we cannot guarantee that the sets $I_b$ and $N_b$ will be finite. However, even in the unrestricted case, the finiteness of terms in the language guarantees that whichever elements we can derive at compile time can be computed in bounded time at run time. Furthermore, the synthesis procedure exhibits strongly monotonic behavior: the more elements of $I_b$ and $N_b$ we compute, the more information we can ascribe to run-time locations regarding the environment. This allows incremental improvements to be achieved simply by running the compiler longer; stopping the procedure at any stage will still yield a correct program, although not necessarily the program attuned to the most specific information available. Since, in general, additional rconj operations consume run-time resources, one reasonable approach would be to have the compiler keep track of the run-time resources consumed and halt when some resource limit is reached.

As we have observed, without placing restrictions on the symbolic language used to specify the background theory, the synthesis method described above would hardly be practical; it is obvious that environment-description languages exist that make the synthesis problem not only intractable but undecidable. However, as with Gapps and other formalisms in the logic programming style, practical synthesis techniques can be developed by restricting ourselves to certain stylized languages. We have experimented with a restriction of the logical language that seems to offer a good compromise between expressiveness and tractability: a weak temporal Horn-clause language resembling Prolog, but with the addition of init and next operators. The derivation process proceeds as described above, using backward-chaining deduction techniques as the specific form of inference. A prototype system has been built implementing the Horn-clause version of the synthesis algorithm. One of the ways the language differs from Prolog is in the strong distinction between compile-time and run-time expressions.
Compile-time expressions undergo unification in the ordinary manner; run-time expressions, by contrast, are simply accumulated and used to generate the final program. The ruler system was run on several small examples involving object tracking and aggregation, and the synthesis procedure has proved tractable in our test implementation.

Using off-line synthesis techniques, conditions that are semantically complex can still be recognized with limited machinery, and for this reason it is entirely consistent with the "reactive" bias to admit sophisticated semantic information and models. With some care, the designer can have the best of both worlds: declarative forms can be used to clarify the semantics of the domain representation, and finite parametric representations can be generated by the compiler to guarantee bounded-time updates and real-time response.

3.2.2 Ruler Example

This section sketches a simple example of a pure-perception system synthesized by the ruler system. Imagine that we again have a plant-watering robot, but we are now concerned with constructing its perceptual system so that it maintains, at all times, as much information as it can about the moisture level of a collection of plants. The representation used by the system must be able to accommodate uncertainty, so we use an interval, representing known lower and upper bounds on the true moisture level of the plant. This gives us our first rule,

    moisture(p, [0, max]).

which states that the moisture of plant p is always between 0 and some maximum level. Additionally, if the robot is at the plant, it can get an approximate reading of the moisture level from its sensor:

    moisture(p, [v-1, v+1]) :-
        at_plant(p, 1),
        moisture_sensor(v).

The at_plant(p, 1) term requires that the robot know that it is at plant p at the time the moisture is being sensed. In this case, there is an input bit, a, such that at_plant(n, a): the robot has been constructed in such a way that if a has value 1, then the robot is known to be at plant n; if it has value 0, the robot is not known to be at that plant. We will treat other propositions similarly.

The dynamics of the world are specified in terms of next rules. If we know that the robot's last action was not to water the plant (either because we know it didn't water, or because we know it wasn't at the plant), then the moisture may either increase (perhaps due to rain) or decrease by 1:

    next moisture(p, [l-1, h+1]) :-
        not_watering(1),
        moisture(p, [l, h]).

    next moisture(p, [l-1, h+1]) :-
        not_at_plant(p, 1),
        moisture(p, [l, h]).

If we know that the robot did just water the plant, then the moisture will increase to its maximum level:

    next moisture(p, [max, max]) :-
        at_plant(p, 1),
        watering(1).

If we don't know whether the robot watered the plant (either because we don't know whether it watered, or because we don't know whether it was at the plant), the bounds spread quickly:

    next moisture(p, [l-1, max]) :-
        moisture(p, [l, h]).

Note that the last rule does not conflict with other rules that provide tighter bounds on the moisture. We combine the results of these rules by specifying an rconj rule for moisture; in this case, it is simply to intersect the intervals. Running ruler on this set of rules results in a circuit that retains as much information as possible about the moisture of the plant, given its inputs and the specified representation.
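To suggest the shape of the bounded-time update machinery that ruler compiles from rules like these, here is a hand-written Python approximation of the resulting update function. It is our own reconstruction, not ruler output: for brevity it folds the current-time sensor rule into the next-state update, and it represents each knowledge bit as True (known true), False (known false), or None (unknown).

    # Hand-simulation of the compiled update for one plant's moisture
    # interval (illustrative). Each applicable rule contributes an interval
    # conclusion; the rconj rule for moisture intersects them all.

    MAX = 4

    def clip(lo, hi):
        """Keep bounds inside the representable range [0, MAX]."""
        return max(lo, 0), min(hi, MAX)

    def update_moisture(interval, at_plant, watering, sensor_reading=None):
        lo, hi = interval
        conclusions = [clip(lo - 1, MAX)]              # catch-all rule always fires
        if watering is False or at_plant is False:
            conclusions.append(clip(lo - 1, hi + 1))   # known not watered
        if at_plant is True and watering is True:
            conclusions.append((MAX, MAX))             # known just watered
        if at_plant is True and sensor_reading is not None:
            conclusions.append(clip(sensor_reading - 1, sensor_reading + 1))
        # rconj for moisture: intersect every applicable conclusion
        return (max(c[0] for c in conclusions), min(c[1] for c in conclusions))

    v = (0, MAX)                          # v0: we know only moisture(p, [0, max])
    v = update_moisture(v, True, True)    # robot watered plant p    -> (4, 4)
    v = update_moisture(v, False, None)   # robot known elsewhere    -> (3, 4)

A contradictory intersection (lower bound above upper bound) would indicate inputs inconsistent with the world model; we ignore that case here for simplicity.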
3.2.3 Objects, Properties, and Relations

While conceptually adequate for generating provably correct perceptual subsystems, at least for nonprobabilistic domain models, ruler is limited in that it makes no special provision for modeling worlds in which objects and their properties and relations are of special importance. This is the case, for example, in visual perception, where objects move in and out of view, and a prime form of information to be extracted from the scene concerns the identity of objects and their spatial relations to one another and to the observer.

To begin to address domains of this type, we developed an information-update schema we named Percm. The Percm schema can be thought of as a specialized form of ruler in which a finite, but shifting, set of objects is being tracked and described. The descriptions are represented as labeled graphs, with node labels representing unary properties of objects, and edge labels representing binary relations between objects. One of the objects is the agent, and the rest of the objects can vary, moving in and out of attentional focus. This scheme bears some relationship to the indexical-functional representations developed by Chapman and Agre for Pengi, but with rigorous correlation-based semantics. The node and edge labels are drawn from a space of data values representing lattice elements, just as in the ruler case, only now the propositional matrix is fixed (i.e., a fixed conjunction of properties and relations) and the lattice elements are constrained to be of semantic type property or relation, or to be coercible to such values.

Figure 5 shows the basic run-time data structures that underlie a Percm with n elements. There is a vector of length n, each of whose elements contains the unary properties of the ith object being tracked. Often, index 1 is reserved for the agent. In addition, there is an $n \times n$ matrix in which cell $\langle i, j \rangle$ contains the strongest representable information available about the relation between objects i and j. In many cases, the relations will be symmetric (or canonicalizable), so that only the upper triangle of the matrix needs to be explicitly represented.

[Figure 5: Data structures supporting an instance of the Percm schema with n objects: a length-n vector of unary properties $P_1, \ldots, P_n$ and an $n \times n$ matrix of binary relations $R_{i,j}$]

The update cycle for this data structure is similar to ruler's, but in the Percm context, fixed background descriptions of the environment are provided not in the form of propositional assertions about world-state transitions, but rather as rules, both temporal and atemporal, for computing object properties and relations. This information is built into a set of operations used to update the data structures. These operations are:

  • create: maps an input value to initial object properties and relations inferable from that input;
  • propagate: strengthens properties and relations among objects x and y by deriving what can be inferred from existing properties and relations between each of x and y and some third object, z;
  • merge: combines the descriptions of objects x and y if their properties and relations imply that they are identical;
  • aggregate: creates a new object y whose existence can be inferred from the existence of constituent objects $x_1, \ldots, x_n$ with appropriate properties and relations, and initializes y's description based on the descriptions of its constituents;
  • degrade: maps properties and relations at time t to new values inferable for time t+1.

The perceptual system is synthesized by composing and iterating these operations to update the object descriptions, with values again drawn from lattices to obtain gracefully degrading approximations. Because Percm is a finite schema of bounded size, to complete the specification of an instance of the schema, the designer must also define how, in the case of object overflow, objects are to be discarded or withdrawn from active attention. The circuitry needed to keep the data structures updated can be large, but it is of bounded size. Operations like finding an empty cell for a new object can be done in a very shallow circuit of size O(n).

3.2.4 Percm Example

In order to illustrate the ideas behind the Percm schema, we present a simple example of its operation. A mobile robot, traveling through a new environment, needs to construct a representation of the salient objects and their spatial relations. The robot might begin by perceiving, instantaneously, that there are two objects in front of it: a chair and a person. It creates two objects, assigning them indices 2 and 3, stores some of their unary properties (such as the type and color of the chair, and the gender and hair color of the person) in cells $P_2$ and $P_3$, and stores bounds on the spatial relations between each one and the robot in $R_{1,2}$ and $R_{1,3}$. Immediately, a propagate operation can compute bounds on the spatial relation between objects 2 and 3 and store it in cell $R_{2,3}$. These objects can be neither merged nor aggregated. Finally, in the degrade step, knowledge about the generic motion abilities of chairs and people, as well as the current motion of the robot, is used to degrade the spatial-relation information. The robot typically has good local odometry (motion information), so it knows how much it has moved relative to the position it was in when it first perceived these objects, and it can update $R_{1,2}$ and $R_{1,3}$ accordingly. If both of these objects were static, the robot could wander away and become confused about its relation to the objects, but still retain precise information about the relation of the objects to each other. In this case, however, people are far from static, so the degrade step will increase the bounds on all spatial relations between the person and other objects, because the person could potentially move in any direction.

On the next cycle, the robot again sees the person, but because of its changed perspective, it is able to measure the person's height. This person gets created as object 4 in the Percm data structures. This time, on the merge step, the robot is able to infer, because of the close spatial positions (and perhaps because two people were not seen simultaneously), that objects 3 and 4 must really be the same. They are merged by conjoining their properties and their relations to other objects and storing them under a single index; the other index is marked as free. Now the height and the hair color are both known about a single person.

The aggregate operation is useful when entire complex objects cannot be perceived instantaneously. Thus, a robot attempting to identify a large truck might individually identify wheels, a cab, and a flat bed, then aggregate them into a truck object.
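The bookkeeping in this example is easy to render as a data-structure sketch. The Python fragment below is our own illustration (Percm itself is a bounded circuit schema, not a program like this); it shows the property vector, the relation matrix, and create and merge steps of the kind just described, with ordinary dictionaries standing in for lattice-valued labels.

    # Illustrative sketch of the Percm run-time data structures for up to
    # n tracked objects. Real instances hold lattice elements and combine
    # them with rconj; dict.update and `or` are crude stand-ins here.

    class Percm:
        def __init__(self, n):
            self.n = n
            self.props = [None] * n                       # unary properties
            self.rels = [[None] * n for _ in range(n)]    # binary relations

        def create(self, props):
            """Allocate a free index for a newly perceived object. Raises
            ValueError on overflow; a real design must specify a policy."""
            i = self.props.index(None)     # shallow "find an empty cell"
            self.props[i] = dict(props)
            return i

        def merge(self, i, j):
            """Conjoin descriptions of objects i and j judged identical."""
            self.props[i].update(self.props[j])
            for k in range(self.n):
                self.rels[i][k] = self.rels[i][k] or self.rels[j][k]
                self.rels[k][i] = self.rels[k][i] or self.rels[k][j]
            self.props[j] = None           # mark index j as free
            for k in range(self.n):
                self.rels[j][k] = self.rels[k][j] = None

A full instance would also supply propagate, aggregate, and degrade operations defined by the domain rules, along with the purging policy discussed next.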
As the data structures begin to get full, it will be important to purge items in a useful way. Objects may be purged because their information is weak, because they are superseded by a complex object, or for a variety of attentional reasons based on the robot's current goals.

3.3 Combined Perception and Action

The techniques illustrated in the two previous sections can be combined directly to synthesize control systems containing both perception and action components. For instance, using the Gapps approach, one could develop mappings from information states to actions, where the information states are the output of a perceptual subsystem synthesized using the ruler or Percm methodologies. If there were no interactions among the design decisions needed for the two subsystems, the definition of the information state of the agent would act as a clean interface, and the combined system would exhibit the intended behavior. In general, however, there are interactions, and in this section we explore the nature of those interactions and potential methods of dealing with them.

The first problem is the specification of the information-state interface between the two modules. This problem exists even when the perceptual mechanism is degenerate. It is possible that the perceptual inputs from the environment do not provide enough information for the goal to be satisfied. Another difficulty arises if the information is available, but is encoded in such a way that the localization machinery is of intractable complexity.

The design of the system becomes much more complex when the actions taken by the agent in the environment affect the information that will be available to it. When choosing action strategies, attention must be given to how actions chosen now will maintain the flow of information necessary for distinguishing among future states to be acted on. In AI, this problem often goes under the label of the knowledge precondition problem [11]: it is not enough to be in an environmental state in which a certain action is appropriate; the agent must know that it is in an environmental state in which that action is appropriate. The problem grows more complex when perceptual machinery distills information contained in the sensory input stream, and more complex still when the goal itself pertains to affecting the agent's own information state. In these cases, the internal structure of the perception module is, from the point of view of the action-selection module, part of some external environment whose dynamic properties are critical to the success or failure of its strategy. Unfortunately, without elaborating the internal structure of the perception module first, statements of fact about this environment cannot be made, and hence no valid action strategy can be chosen. In general, action strategies intended to satisfy information goals can only coherently be developed in the context of fixed perceptual machinery, or at least in the context of articulated assumptions about the perceptual machinery.

A natural development methodology, then, would be to design the perception module first, choosing conditions to be tracked and defining update circuitry that tracks these conditions in the passive sense introduced in the previous section, but does not guarantee the input streams that will force it into the right state. After defining this fixed machinery, an action strategy can be defined, relying on the definition of the perception component as if it were part of the environment. This strategy is designed to cause the input streams flowing into the perception component to drive it into the appropriate states, and it actively makes use of constraints imposed by the previously chosen structure of the perception module.
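Architecturally, the outcome of this methodology is a simple synchronous loop in which the information state computed by the perception component is the only interface seen by action selection. The sketch below is schematic and entirely our own: `perceive` and `policy` are hypothetical stand-ins for modules of the kind synthesized by ruler or Percm and by Gapps, and `env` is a hypothetical environment interface.

    # Schematic agent loop combining a fixed perception component with an
    # action strategy defined over its information states (illustrative).

    def run_agent(env, perceive, policy, v0, steps):
        state = v0            # information state, e.g. a vector of intervals
        action = None         # no action has been taken before the first tick
        for _ in range(steps):
            percept = env.observe()
            state = perceive(state, action, percept)   # track, as in section 3.2
            action = policy(state)                     # act on the information state
            env.step(action)

Note that the loop treats `perceive` as part of the environment from the strategy's point of view: `policy` can rely on how `perceive` responds to the input streams that the chosen actions induce.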
In principle, when perception and action modules are generated from declarative domain descriptions, a single set of facts about the environment should suffice to generate both modules. In other words, ruler-like state-transition rules, combined with operator-description-like action descriptions, contain enough constraints to generate systems that seek information. The ruler rules generate a perceptual system that maintains, as an invariant, correlations with the conditions that the action system needs to test. This approach can involve a search, albeit at design time, for suitable conditions that can be effectively tracked.

In all of these approaches, the result is an automaton with an objective informational relation to its environments. This is unlike the usual case in AI, in which knowledge pre- and post-conditions have been analyzed using theories that link the internal states of agents to their environment only through stipulated semantic-denotation relations, attributed by designers somewhat arbitrarily to symbolic data. This distinction is substantial, and it is encouraging that many of the same semantic desiderata that have been pursued in traditional AI planning and representation systems can be achieved in a more mechanistic, and potentially far more efficient, control-theoretic setting.

A final area of complexity is the pervasive uncertainty found in natural environments. Throughout this work, we have modeled uncertainty using simple nondeterminism. While this allows designers and machines to avoid committing to information they do not possess, these models are extremely conservative in that they regard all alternative states that are not ruled out by hard constraints to be of equal importance. In real task domains, however, some of those alternatives are far more likely than others, and this fact is essential to the proper exploitation of the information. A model that is midway between deterministic and nondeterministic models is the probabilistic model, in which state transitions, under a given input, are described by probability distributions. A natural mathematical model for such systems is the Markov process, which has been studied extensively by applied mathematicians.

The difficulty in using probabilistic models together with the symbolic techniques described above is the nonmonotonicity of probabilities, which leads to noncompositionality of the design technique. By conditioning on further evidence, the probability of a proposition can either be reduced or increased. This means that a designer cannot, in general, define a module of the perceptual component, prove a strong statement about the semantics of its outputs, and then proceed to use that module together with other modules; conditioning on the joint states of the modules may completely undermine the intended semantics of the first module. Furthermore, the action strategy embodied in the action-selection component is integral to the definition of the probabilistic state-transition matrix of the entire system. Just as before, when we could not, in principle, define an action strategy before providing a fixed definition of the perception component, here we cannot define the perception component without constraining action first.
The apparent circularity only points to the fundamental need to consider the agent as an integrated whole; the behavior of the entire system, agent plus environment, is determined only when all the boundary conditions have been specified. Interim constraints and incremental refinement may be useful, but they must be used cautiously, especially when modeling domains probabilistically. The theory of partially observable Markov decision processes [9, 1] provides a theoretically well-founded methodology for deriving controllers in stochastic domains, but it seems to be computationally intractable.

4 Conclusions

The aim of situated-automata theory is to provide a new semantic perspective on intelligent agents. Traditional AI has been dominated by "reasoning" metaphors drawn from folk psychology, in which programs are seen as actors manipulating linguistic elements, drawing conclusions from premises, and constructing representations of action. The semantics of these systems have been made rigorous, but are almost always imposed by their designers. Moreover, traditional models have often failed to explain how so much "reasoning" can get done so fast with so little hardware. Reactive-agent architectures have been proposed as an alternative to traditional AI, but to date the theoretical foundations for this work have been less developed.

Situated-automata theory provides a semantic analysis of information processing that is intended to apply to all embedded control systems, without requiring designer-conceived interpretations of machine state or computational models based on run-time inference. It is based on a direct mathematical model of how the states of natural processes, in the ways they unfold over time, reflect one another through intricate cross-dependencies and correlations that give rise to semantically meaningful states. The theory brings the semantic precision associated with traditional logic-based AI to the analysis of systems that are not structured as conventional reasoning systems at all. Nor are systems that do seem to "reason" excluded from this style of analysis; they are simply a special case. Note that none of this analysis is inconsistent with the construction of agents as symbolic systems; it simply makes explicit the constraints that must hold for their intended interpretation to be valid and provides methods for using symbolic characterizations as program specifications rather than as an implementation strategy.

The shift from the traditional AI view to the situated view brings us to an outlook reminiscent of early cybernetic feedback models, but with more semantic subtlety and sophistication (derived from traditional symbolic AI) in describing the conditions being tracked and controlled by an agent. In this view, the fundamental phenomenon to be explained is not "reasoning" but the mutual constraint exhibited between parts of a physical system over time. The key lies in understanding how a process can naturally mirror in its states subtle conditions in its environment, and how these mirroring states ripple out to overt actions that eventually achieve goals.
The fundamental questions include how the enormous set of discernible conditions can be modeled and grasped, how computational elements can be arranged to preserve distinctions that matter for controlling the environment while perhaps blurring others, and how this can be done in real time, with high reliability, using relatively modest computational resources.

While the analytical approach presented here is very general in scope, its application to synthesis problems and to the design of particular systems remains quite challenging. In this paper we have attempted to sketch directions we regard as promising, primarily involving the use of stylized off-line symbolic reasoning to generate tractable run-time machinery with desired properties of information and control. Work remains to be done on the integration of automated learning techniques, as well as on the modeling and exploitation of statistical covariance in ways analogous to the discrete logic-based techniques presented here.

Acknowledgments

We derived a great deal of help and inspiration from our colleagues over the years of this project. Stanley Reifel built Flakey, an experimental mobile robot platform, and constantly challenged us to match in working software what we derived in elegant formulas. Sandy Wells brought a knowledge of computer vision, hardware, and hacking that was invaluable. Nathan Wilson implemented endless versions of and variations on Rex and wrote some crucial navigational code for Flakey. Stuart Shieber was a valuable adjunct to the group and implemented natural language modules for the robot programs. Fernando Pereira was an important influence on the early development of situated-automata theory. David Chapman spent some summers with us and helped make Rex a much better language, through both ideas and implementation. He also worked on Ruler and some of its precursors. We are generally indebted to and appreciative of our colleagues at the Artificial Intelligence Center of SRI International, at Stanford University's Center for the Study of Language and Information, and at Teleos Research. We gratefully acknowledge financial support from these institutions as well as from sponsors at the Defense Advanced Research Projects Agency, the National Aeronautics and Space Administration, the Air Force Office of Scientific Research, General Motors Research, and FMC.

References

[1] A. R. Cassandra, L. P. Kaelbling, and M. L. Littman. Acting optimally in partially observable stochastic domains. In Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA, 1994.

[2] J. Halpern and Y. Moses. Knowledge and common knowledge in a distributed environment. In Proceedings of the Third ACM Conference on Principles of Distributed Computing, pages 50-61, 1984. A revised version appears as IBM RJ 4421.

[3] G. E. Hughes and M. J. Cresswell. An Introduction to Modal Logic. Methuen and Company, London, 1968.

[4] L. P. Kaelbling. Rex: A symbolic language for the design and parallel implementation of embedded systems. In Proceedings of the AIAA Conference on Computers in Aerospace, Wakefield, Massachusetts, 1987.

[5] L. P. Kaelbling. Goals as parallel program specifications. In Proceedings of the Seventh National Conference on Artificial Intelligence, Minneapolis-St. Paul, Minnesota, 1988.

[6] L. P. Kaelbling. Compiling operator descriptions into reactive strategies using goal regression. Technical report, Teleos Research, Palo Alto, California, 1991.

[7] L. P. Kaelbling and S. J. Rosenschein. Action and planning in embedded agents.
Robotics and Autonomous Systems, 6(1):35-48, 1990. Also published in Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back, Pattie Maes, editor, The MIT Press/Elsevier, 1991.

[8] L. P. Kaelbling and N. J. Wilson. Rex programmer's manual. Technical Report 381R, Artificial Intelligence Center, SRI International, Menlo Park, California, 1988.

[9] W. S. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research, 28(1):47-65, 1991.

[10] J. McCarthy and P. J. Hayes. Some philosophical problems from the standpoint of artificial intelligence. In B. Meltzer and D. Michie, editors, Machine Intelligence 4. Edinburgh University Press, Edinburgh, 1969.

[11] R. C. Moore. A formal theory of knowledge and action. In J. R. Hobbs and R. C. Moore, editors, Formal Theories of the Commonsense World. Ablex Publishing Company, Norwood, New Jersey, 1985.

[12] A. Newell. The knowledge level. Artificial Intelligence, 18:87-127, 1982.

[13] S. J. Rosenschein. Plan synthesis: A logical perspective. In Proceedings of the Seventh International Joint Conference on Artificial Intelligence, Vancouver, British Columbia, 1981. Reprinted in Readings in Planning, J. Allen, J. Hendler, and A. Tate, eds., Morgan Kaufmann, 1990.

[14] S. J. Rosenschein. Formal theories of knowledge in AI and robotics. New Generation Computing, 3(4):345-357, 1985.

[15] S. J. Rosenschein. Synthesizing information-tracking automata from environment descriptions. In Proceedings of the Conference on Principles of Knowledge Representation and Reasoning, Toronto, Canada, 1989.

[16] S. J. Rosenschein and L. P. Kaelbling. The synthesis of digital machines with provable epistemic properties. In J. Halpern, editor, Proceedings of the Conference on Theoretical Aspects of Reasoning About Knowledge, pages 83-98. Morgan Kaufmann, 1986. An updated version appears as Technical Note 412, Artificial Intelligence Center, SRI International, Menlo Park, California.

[17] R. Waldinger. Achieving several goals simultaneously. In Machine Intelligence 8. Ellis Horwood Limited, Chichester, 1977. Reprinted in Readings in Planning, J. Allen, J. Hendler, and A. Tate, eds., Morgan Kaufmann, 1990.

Journal: Artif. Intell.

Volume: 73
Pages: -
Publication date: 1995